SYSTEM FOR CONTROLLING A POSITION OF A MASS



BACKGROUND OF THE INVENTION 

1. Priority Documentation 

[0001] This application claims priority from European Patent Application 
No. 03075660.5, filed March 6, 2003, herein incorporated by reference in its entirety. 

2. Field of the Invention 

[0002] The present invention relates to a control system that controls a position of
a mass, and in particular to a control system that controls a position of a moving
component of a lithographic apparatus.

3. Description of the Related Art 

[0003] A lithographic apparatus is a machine that applies a desired pattern onto a 
target portion of a substrate. Lithographic apparatus can be used, for example, in the 
manufacture of integrated circuits (ICs). In that circumstance, a patterning device, such 
as a mask, may be used to generate a circuit pattern corresponding to an individual 
layer of the IC, and this pattern can be imaged onto a target portion (e.g. comprising 
part of, one or several dies) on a substrate (e.g. a silicon wafer) that has a layer of 
radiation-sensitive material (resist). 

[0004] The term "patterning device" as employed herein should be broadly 
interpreted as referring to a mechanism that can be used to endow an incoming 
radiation beam with a patterned cross-section, corresponding to a pattern that is to be 
created in a target portion of the substrate; the term "light valve" can also be used in 
this context. Generally, the pattern will correspond to a particular functional layer in a 
device being created in the target portion, such as an integrated circuit or other device 
(see below). Examples of such a patterning device include: 

[0005] a mask: the concept of a mask is well known in lithography, and it 
includes mask types such as binary, alternating phase-shift, and attenuated 
phase-shift, as well as various hybrid mask types. Placement of such a mask in 
the radiation beam causes selective transmission (in the case of a transmissive
mask) or reflection (in the case of a reflective mask) of the radiation impinging 
on the mask, according to the pattern on the mask. In the case of a mask, the 
support structure will generally be a mask table, which ensures that the mask 
can be held at a desired position in the incoming radiation beam, and that it can 
be moved relative to the beam if so desired; 

[0006] programmable mirror array: an example of such a device is a
matrix-addressable surface having a visco-elastic control layer and a reflective surface.
The basic principle behind such an apparatus is that (for example) addressed 
areas of the reflective surface reflect incident light as diffracted light, whereas 
unaddressed areas reflect incident light as undiffracted light. Using an 
appropriate filter, the said undiffracted light can be filtered out of the reflected 
beam, leaving only the diffracted light behind; in this manner, the beam 
becomes patterned according to the addressing pattern of the matrix-addressable 
surface. The required matrix addressing can be performed using suitable 
electronic means. More information on such mirror arrays can be gleaned, for 
example, from United States Patent Nos. 5,296,891 and 5,523,193, which are 
incorporated herein by reference. In the case of a programmable mirror array, 
the said support structure may be embodied as a frame or table, for example, 
which may be fixed or movable as required; and 

[0007] programmable LCD array: an example of such a construction is 
given in United States Patent No. 5,229,872, which is incorporated herein by 
reference. As above, the support structure in this case may be embodied as a 
frame or table, for example, which may be fixed or movable as required. 

[0008] For purposes of simplicity, the rest of this text may, at certain locations, 
specifically direct itself to examples involving a mask and mask table; however, the 
general principles discussed in such instances should be seen in the broader context of 
the patterning device as set forth above. 

[0009] In general, a single substrate will contain a network of adjacent target
portions that are successively exposed. Known lithographic apparatus include so-called 
steppers, in which each target portion is irradiated by exposing an entire pattern onto 
the target portion in one go, and so-called scanners, in which each target portion is 
irradiated by scanning the pattern through the projection beam in a given direction (the 
"scanning"-direction) while synchronously scanning the substrate parallel or 
anti-parallel to this direction. Because, typically, the projection system will have a 
magnification factor M (generally < 1), the speed V at which the substrate table is 
scanned will be a factor M times that at which the mask table is scanned. More 
information with regard to lithographic devices as here described can be gleaned, for 
example, from United States Patent No. 6,046,792, incorporated herein by reference. 

[00010] In a manufacturing process using a lithographic projection apparatus, the 
pattern is imaged onto a substrate that is at least partially covered by a layer of 
radiation-sensitive material (resist). Prior to this imaging step, the substrate may 
undergo various procedures, such as priming, resist coating and a soft bake. After 
exposure, the substrate may be subjected to other procedures, such as a post-exposure 
bake (PEB), development, a hard bake and measurement/inspection of the imaged 
features. This array of procedures is used as a basis to pattern an individual layer of a 
device, e.g. an IC. Such a patterned layer may then undergo various processes such as 
etching, ion-implantation (doping), metallization, oxidation, chemo-mechanical 
polishing, etc., all intended to finish off an individual layer. 

[00011] If several layers are required, then the whole procedure, or a variant thereof, 
will have to be repeated for each new layer. Eventually, an array of devices will be 
present on the substrate (wafer). These devices are then separated from one another by 
a technique such as dicing or sawing, whence the individual devices can be mounted on 
a carrier, connected to pins, etc. Further information regarding such processes can be 
obtained, for example, from the book "Microchip Fabrication: A Practical Guide to 
Semiconductor Processing", Third Edition, by Peter van Zant, McGraw Hill Publishing 
Co., 1997, ISBN 0-07-067250-4, incorporated herein by reference. 

[00012] For the sake of simplicity, the projection system may hereinafter be referred
to as the "lens"; however, this term should be broadly interpreted as encompassing 
various types of projection system, including refractive optics, reflective optics, and 
catadioptric systems, for example. The radiation system may also include components 
operating according to any of these design types for directing, shaping or controlling 
the projection beam of radiation, and such components may also be referred to below, 
collectively or singularly, as a "lens". Further, the lithographic apparatus may be of a 
type having two or more substrate tables (and/or two or more mask tables). In such 
"multiple stage" devices the additional tables may be used in parallel, or preparatory 
steps may be carried out on one or more tables while one or more other tables are being 
used for exposures. Twin stage lithographic apparatus are described, for example, in 
United States Patent No. 5,969,441 and WO 98/40791, incorporated herein by 
reference. 

[00013] Regarding the lithographic apparatus, a substrate table supporting a 
substrate is moved in an XY operating region by actuators controlled by a controller. It 
has been observed that the controller error behavior directly after accelerating to a 
constant velocity depends on the position of the substrate table in the XY operating 
region. It also depends on the exact mass that is to be moved, which varies due to, for 
example, variations in the mass of the substrate. 

SUMMARY OF THE INVENTION 

[00014] Principles of the present invention, as embodied and broadly described 
herein, provide for a controller configured to control a position of a mass by providing 
the mass a mass acceleration by a control force depending on a desired mass 
acceleration. The invention is especially applicable in controlling a position of a 
substrate table or a mask table in a lithographic apparatus. In one embodiment, the
controller is arranged to receive a feedback signal comprising status information of the
mass, to calculate an estimated relation between the mass acceleration and said control 
force from said feedback signal and from said control force, and to use said estimated 
relation and said desired mass acceleration to determine said control force. Preferably, 
the status information comprises an indication of the position, the speed and/or the
acceleration of the mass. In a particular embodiment the feedback signal is a feedback 
acceleration signal of the mass (the acceleration signal being measured or calculated). 

[00015] In such a controller, feed-forward of the acceleration force adapts to 
position-dependent behavior and mass variations, and the controller error becomes 
smaller and less dependent on the position of the table in its operating region. 

[00016] In an embodiment, the estimated relation may be an estimated mass. This 
embodiment is applicable where the mass is operating as a rigid body. 

[00017] In an embodiment, the controller may be arranged to calculate at least one of
an estimated velocity coefficient, an estimated jerk coefficient, and an estimated snap
coefficient from the feedback position signal and the control force. The controller may
use these coefficients to obtain a better estimate of the mass and/or to partly determine
the control force. This may further improve the accuracy of the controller.

[00018] In another embodiment, the controller is arranged to calculate estimated filter 
coefficients of a general filter structure, such that the resulting filter describes the 
relation between the acceleration of the mass and the applied control force. The 
controller is further arranged to use the estimated filter coefficients and the desired 
mass acceleration to partly determine the control force. 

[00019] In a further embodiment, the invention relates to a lithographic projection 
apparatus comprising a radiation system for providing a projection beam of radiation, a 
support structure for supporting a patterning device, the patterning device serving to
pattern the projection beam according to a desired pattern, a substrate table for holding 
a substrate, a projection system for projecting the patterned beam onto a target portion 
of the substrate, and a controller as defined above, the mass being a movable object in 
the lithographic apparatus. 

[00020] The invention also relates to a method of controlling a position of a mass by
providing the mass a mass acceleration by a control force depending on a desired mass
acceleration, characterized by receiving a feedback position signal indicating a position
of the mass, calculating an estimated relation between the mass acceleration and the 
control force from the feedback position signal and from the control force and using the 
estimated relation and the desired mass acceleration to determine the control force. 

[00021] In a further embodiment, the invention uses such a method in a device 
manufacturing method comprising providing a substrate that is at least partially covered 
by a layer of radiation-sensitive material, providing a projection beam of radiation 
using a radiation system, using a patterning device to endow the projection beam with a
pattern in its cross-section, projecting the patterned beam of radiation onto a target 
portion of the layer of radiation-sensitive material, and controlling a position of the 
mass, the mass being at least one of the substrate holder with the substrate and the 
support structure with the patterning device. 

[00022] The projection system as described above usually comprises one or more, for 
instance six, projection devices, such as lenses and/or mirrors. The projection devices 
transmit the projection beam through the projection system and direct it to the target 
portion. If the projection beam is EUV radiation, mirrors should be used instead
of lenses to project the projection beam, since lenses are not transparent to
EUV radiation.

[00023] When an extreme ultraviolet projection beam is used for projecting relatively 
small patterns, the demands for the projection system concerning the accuracy are 
rather high. For instance, a mirror that is positioned with a tilting error of 1 nm can
result in a projection error of approximately 4 nm on the wafer.

[00024] A projection system for projecting an extreme ultraviolet projection beam 
comprises, for instance, 6 mirrors. Usually, one of the mirrors has a fixed spatial
orientation, while the other five are each mounted on a Lorentz-actuated mount. These
mounts can preferably adjust the orientation of the mirrors in 6 degrees of freedom
(6-DoF mounts) using 6 Lorentz engines per mirror. The projection system
further comprises sensors for measuring the spatial orientation of the projection
devices.

[00025] The projection system is mounted to the fixed world, for instance a metro
frame, using a 30 Hz mounting device. This is done to stabilize the projection beam and
isolate it from vibrations and disturbances coming from the environment, such as
adjacent systems. As a result of this mounting, unwanted disturbances above 30 Hz
are almost completely filtered out. However, disturbances having a frequency of
approximately 30 Hz are not stopped by the mounting device and are even amplified.

[00026] Although specific reference may be made in this text to the use of the 
apparatus according to the invention in the manufacture of ICs, it should be explicitly 
understood that such an apparatus has many other possible applications. For example, it 
may be employed in the manufacture of integrated optical systems, guidance and 
detection patterns for magnetic domain memories, liquid-crystal display panels, 
thin-film magnetic heads, etc. The skilled artisan will appreciate that, in the context of 
such alternative applications, any use of the terms "reticle", "wafer" or "die" in this text 
should be considered as being replaced by the more general terms "mask", "substrate" 
and "target portion", respectively. 

[00027] In the present document, the terms "radiation" and "beam" are used to 
encompass all types of electromagnetic radiation, including ultraviolet (UV) radiation 
(e.g. with a wavelength of 365, 248, 193, 157 or 126 nm) and extreme ultra-violet 
(EUV) radiation (e.g. having a wavelength in the range 5-20 nm), as well as particle 
beams, such as ion beams or electron beams. 

BRIEF DESCRIPTION OF THE DRAWINGS

[00028] Embodiments of the invention will now be described, by way of example 
only, with reference to the accompanying schematic drawings in which: 

[00029] Figure 1 is a schematic general overview of a lithographic projection 
apparatus; 

[00030] Figure 2 shows a control architecture according to the state of the art; 

[00031] Figure 3 shows a basic scheme for on-line mass estimation in accordance 
with the invention; 

[00032] Figure 4 shows a circuit for generation of an estimation error; 

[00033] Figure 5 shows several curves for mass estimation on a wafer stage. The 
curves include a set point acceleration curve, a mass estimation curve, a mass 
estimation curve using high-pass filters, and a mass estimation using an offset 
estimation; 

[00034] Figure 6 shows an example of mass estimation and feed-forward; 

[00035] Figure 7 shows curves for the control error without and with mass estimation 
feed-forward; 

[00036] Figure 8 shows a circuit that can be used to estimate velocity, acceleration, 
jerk and snap feed-forward; 

[00037] Figure 9 shows curves for mass estimation when various other parameters 
are estimated; 

[00038] Figure 10 shows curves for estimations of velocity, acceleration, jerk and 
snap components; 

[00039] Figure 11 shows a feed-forward circuit for mass estimation and snap
feed-forward;

[00040] Figure 12a shows the controller error with and without snap feed-forward,
whereas Figure 12b shows a portion of Figure 12a on an enlarged time scale; 

[00041] Figure 13 shows mass estimation with and without snap feed-forward; 

[00042] Figure 14a shows the controller error with and without snap feed-forward 
when using 10 Hz high-pass filters for the estimation, whereas Figure 14b shows a 
portion of Figure 14a on an enlarged time scale; 

[00043] Figure 15 shows mass estimation using 10 Hz high-pass filters, without and 
with snap feed-forward; 

[00044] Figure 16 shows mass estimation with 0.1 samples less delay in force path; 

[00045] Figure 17 shows several curves for the exposed chuck during a negative 
move; 

[00046] Figure 18 shows several curves for the exposed chuck during a positive 
move; 

[00047] Figure 19 shows several curves for the measure chuck during a negative 
move; 

[00048] Figure 20 shows several curves for the measure chuck during a positive 
move; 

[00049] Figure 21 shows the effect of switching on the estimator only at maximum 
acceleration; 

[00050] Figure 22 shows an alternative least-square estimation for 1 parameter; 

[00051] Figure 23 shows an alternative least-square estimation for 1 parameter with 
forgetting factor; 

[00052] Figure 24 shows a simplified offset estimation scheme; 

[00053] Figure 25 shows an ARX filter structure for a simplified implementation to
find the optimal estimation mass; 

[00054] Figure 26 shows a FIR filter architecture taking a forgetting factor into 
account; 

[00055] Figure 27 shows the magnitude and phase of an estimated transfer function 
in an alternative approach; the situation is shown for both a FIR filter and an ARX 
filter; 

[00056] Figure 28 shows the feed-forward for the FIR and ARX filters used in 
Figure 27; 

[00057] Figure 29 shows the estimated mass for the situation corresponding with
Figures 27 and 28;

[00058] Figures 30, 31 and 32 show curves similar to those in Figures 27, 28 and 29,
respectively, except that the filters used are of a much lower order;
and

[00059] Figure 33 shows a circuit architecture with on-line feed-forward estimation. 

[00060] In the Figures, corresponding reference symbols indicate corresponding 
parts. 

DETAILED DESCRIPTION

[00061] Figure 1 schematically depicts a lithographic projection apparatus according 
to a particular embodiment of the invention. The apparatus comprises: 

[00062] an illumination system Ex, IL: for supplying a beam PB of
radiation (e.g. EUV radiation with a wavelength of 11-14 nm). In this 
particular case, the radiation system also comprises a radiation source LA; 

[00063] a first object table (mask table or holder) MT: provided with a 
mask holder for holding a mask MA (e.g. a reticle), and connected to first 
positioning mechanism PM for accurately positioning the mask with 
respect to item PL; 

[00064] a second object table (substrate table or substrate holder) WT:
provided with a substrate holder for holding a substrate W (e.g. a 
resist-coated silicon wafer), and connected to second positioning 
mechanism PW for accurately positioning the substrate with respect to 
item PL; and 

[00065] a projection system ("lens") PL: for imaging an irradiated 
portion of the mask MA onto a target portion C (e.g. comprising one or 
more dies) of the substrate W. 

[00066] As here depicted, the apparatus is of a reflective type (i.e. has a reflective 
mask). However, in general, it may also be of a transmissive type, for example (with a 
transmissive mask). Alternatively, the apparatus may employ another kind of patterning 
device, such as a programmable mirror array of a type as referred to above. 

[00067] The source LA (e.g. a laser-produced plasma or a discharge plasma EUV 
radiation source) produces a beam of radiation. This beam is fed into an illumination 
system (illuminator) IL, either directly or after having traversed conditioning means, 
such as a beam expander Ex, for example. The illuminator IL may comprise adjusting 
means AM for setting the outer and/or inner radial extent (commonly referred to as
σ-outer and σ-inner, respectively) of the intensity distribution in the beam. In addition, it
will generally comprise various other components, such as an integrator IN and a 
condenser CO. In this way, the beam PB impinging on the mask MA has a desired 
uniformity and intensity distribution in its cross-section. 

[00068] It should be noted with regard to Figure 1, that the source LA may be within 
the housing of the lithographic projection apparatus (as is often the case when the 
source LA is a mercury lamp, for example), but that it may also be remote from the 
lithographic projection apparatus, the radiation beam which it produces being led into 
the apparatus (e.g. with the aid of suitable directing mirrors); this latter scenario is often 
the case when the source LA is an excimer laser. The current invention and claims 
encompass both of these scenarios. 

[00069] The beam PB subsequently intercepts the mask MA, which is held on a mask 
table MT. Having traversed the mask MA, the beam PB passes through the lens PL, 
which focuses the beam PB onto a target portion C of the substrate W. With the aid of 
the second positioning mechanism PW (and interferometric measuring means IF), the 
substrate table WT can be moved accurately, e.g. so as to position different target 
portions C in the path of the beam PB. Similarly, the first positioning mechanism PM 
can be used to accurately position the mask MA with respect to the path of the beam 
PB, e.g. after mechanical retrieval of the mask MA from a mask library, or during a 
scan. 

[00070] In general, movement of the object tables MT, WT will be realized with the 
aid of a long-stroke module (coarse positioning) and a short-stroke module (fine 
positioning), which are not explicitly depicted in Figure 1. However, in the case of a 
wafer stepper (as opposed to a step-and-scan apparatus) the mask table MT may just be 
connected to a short stroke actuator, or may be fixed. Mask MA and substrate W may 
be aligned using mask alignment marks M1, M2 and substrate alignment marks P1, P2.

[00071] The depicted apparatus can be used in different modes:

[00072] step mode: the mask table MT is kept essentially stationary, and 
an entire mask image is projected in one go (i.e. a single "flash") onto a 
target portion C. The substrate table WT is then shifted in the x and/or y 
directions so that a different target portion C can be irradiated by the beam 
PB; 

[00073] scan mode: essentially the same scenario applies, except that a 
given target portion C is not exposed in a single "flash". Instead, the mask 
table MT is movable in a given direction (the so-called "scan direction", 
e.g. the y direction) with a speed v, so that the projection beam PB is 
caused to scan over a mask image; concurrently, the substrate table WT is
moved in the same or opposite direction at a speed V =
Mv, in which M is the magnification of the lens PL (typically, M = 1/4 or 
1/5). In this manner, a relatively large target portion C can be exposed, 
without having to compromise on resolution; 

[00074] other mode: the mask table MT is kept essentially stationary 
holding a programmable patterning means, and the substrate table WT is 
moved or scanned while a pattern imparted to the projection beam is 
projected onto a target field C. In this mode, generally a pulsed radiation 
source is employed and the programmable patterning means is updated as 
required after each movement of the substrate table WT or in between 
successive radiation pulses during a scan. This mode of operation can be 
readily applied to maskless lithography that utilizes programmable 
patterning means, such as a programmable mirror array of a type as 
referred to above. 

First Embodiment

[00075] The accuracy of positioning the mask and wafer stages depends, in part, on
the accuracy of the set point feed-forward. Currently, only a set point acceleration
feed-forward using the (calibrated) mass is used, as depicted in Figure 2. It is observed that
Figure 2 is a simplified picture, without decoupling, etc. Moreover, it is observed that,
below, the invention will be illustrated with reference to the wafer stage (i.e., the wafer 
table WT, together with a wafer W), but it should be understood that the invention is 
equally applicable with the mask stage. 

[00076] Figure 2 shows a control system 3 for a wafer stage (WS) 12, as may be
implemented by software in any type of computer arrangement (not shown) in the
second positioning mechanism PW (Figure 1). Such a computer arrangement may
comprise a processor, a single computer, or a plurality of computers acting in
co-operation. However, alternatively, any suitable type of analog and/or digital circuits
may be used as well, as will be evident to persons skilled in the art. In that sense, 
Figure 2 (and the other architecture Figures described here) only show functional 
modules that can be implemented in many different ways. Apart from the wafer stage 
12, all of the components shown in Figure 2, can be thought of as being included in the 
second positioning mechanism PW. 

[00077] The control system 3 comprises a comparator 16 that receives a position set 
point signal for wafer stage 12 and an actual position signal from the wafer stage 12 
that is fed back to the comparator 16. Based on the comparisons of the set point signal 
and the actual position signal, the comparator generates a positional track error signal 
that is output to a control unit 2, such as, for example, a Proportional-Integral- 
Derivative (PID) control unit. 

[00078] PID control unit 2 is configured to provide a signal that is proportional to the
force that will be applied to wafer stage 12 by the control system 3. The output of PID
control unit 2 is connected to a first notch filter 4. The first notch filter 4 outputs a
signal to a first adder unit 6 that has its output connected to a second notch filter 8. The
first adder unit 6 also receives a force set point signal and adds this to the output of the
first notch filter 4. The force set point signal is delivered as output signal from a
multiplier unit 14 that multiplies a received acceleration set point signal by a mass m_ff,
i.e., the feed-forward mass of the wafer stage 12. The use of mass m_ff makes it possible
to move the wafer stage 12 from one position to another while minimizing the risk of
increasing the positional tracking error.

[00079] The second notch filter 8 has its output connected to a second adder unit 10. 
The second adder unit 10 can also receive additional feed-forward signals. The second
adder unit 10 outputs the addition of its two input signals as a position control force to 
the wafer stage 12 to minimize the output of comparator 16 (the controller error). 

[00080] It will be appreciated that the architecture of Figure 2 is capable of 
controlling both a desired position of the wafer stage 12 and a translation to a new 
position via a controlled acceleration in an XY operating region of the wafer stage 12. 
Because the set point acceleration equals the second derivative of the set point position,
the feed-forward force generated by multiplier unit 14 results in a movement that 
already closely matches the set point position. The PID control unit 2 therefore only 
takes care of remaining deviations between the actual wafer stage path and that dictated 
by its set point position. 
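
By way of illustration only, the simplified architecture of Figure 2 can be sketched in Python as follows. The sketch assumes a rigid-body stage, omits the notch filters 4 and 8 and the additional feed-forward inputs, and uses hypothetical values for the gains, the masses, the sample time and the constant acceleration set point; the function name peak_tracking_error is likewise not taken from this text. It is intended only to show why a feed-forward mass that matches the actual mass leaves little for the PID control unit to correct.

```python
def peak_tracking_error(m_true, m_ff, kp=4.0e5, ki=1.0e6, kd=2.0e3,
                        acc_sp=10.0, dt=1.0e-4, steps=2000):
    """Peak controller error for a rigid mass under PID plus mass feed-forward.

    All gains, masses and the constant acceleration set point are hypothetical.
    """
    x = v = 0.0          # actual stage position and velocity
    x_sp = v_sp = 0.0    # set point position and velocity
    integral = prev_err = 0.0
    peak = 0.0
    for _ in range(steps):
        err = x_sp - x                       # comparator 16: controller error
        integral += err * dt
        # PID control unit 2 (notch filters omitted in this sketch)
        f_pid = kp * err + ki * integral + kd * (err - prev_err) / dt
        prev_err = err
        f_ff = m_ff * acc_sp                 # multiplier unit 14: force set point
        force = f_pid + f_ff                 # adder unit: total control force
        a = force / m_true                   # rigid-body stage, F = m a
        v += a * dt
        x += v * dt
        v_sp += acc_sp * dt                  # set point integrated the same way
        x_sp += v_sp * dt
        peak = max(peak, abs(err))
    return peak


if __name__ == "__main__":
    print(peak_tracking_error(20.0, 20.0))   # feed-forward mass matches the stage
    print(peak_tracking_error(20.0, 19.8))   # feed-forward mass 1 % too low
```

With a matched feed-forward mass the feedback path has nothing to correct in this idealized model, whereas a mismatch leaves a residual error that the PID control unit must remove.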

[00081] It has been observed that the controller error behavior (output of comparator 
16) is dependent on the position of the wafer stage 12: the error is larger in the corners 
of the XY operating region. Note that a decoupling (gain balancing) matrix, that 
incorporates all gain effects like the mass, amplifier gain, motor constant (of an 
actuator, not shown, to translate the wafer stage 12), etc., is calibrated only in the centre 
of the operating region. The feed-forward mass m_ff, as depicted in Figure 2, together
with the accompanying delay correction, is also calibrated in the centre of the operating 
region. 

[00082] One possible explanation of the position-dependent behavior lies in a 
changing 'gain' of the physical wafer stage 12, caused by one of the above-mentioned 
effects: mass, amplifier gain, motor force constant. Here, on-line mass estimation is 
proposed as a method to compensate for this effect. Besides position-dependence, other 
time-varying effects are countered as well, like ageing of components (amplifier, motor
of actuator, etc.), or changed behavior due to heating of components (amplifiers, 
motors). Also, variations in the mass of the substrates that are subsequently exposed are 
countered. 

[00083] The basic idea of on-line mass estimation is to continuously estimate the 
mass of the wafer stage 12 from the wafer stage input force and the resulting position 
change, i.e., from the output signals from the second adder unit 10 and the wafer stage 
12. The estimated mass is used to modify the feed-forward coefficient m_ff as used in the
multiplier 14. If the estimation is fast enough, position-dependent behavior can be 
captured. 

[00084] Note that not only the wafer stage mass is estimated but all aspects that 
influence the gain from the wafer stage input to its output are included in the 
estimation. Note also that, because only the feed-forward is adjusted, and the controller 
gain is left unchanged, the instability risk is minimized. 

[00085] The basic architecture of the on-line estimation according to the invention is 
shown in Figure 3. In Figure 3, the same reference numerals as in Figure 2 refer to the 
same components. In addition to Figure 2, a mass estimation unit 18 is used that 
receives the output signals from the second adder unit 10 and the wafer stage 12 as 
input signals. From those input signals, it calculates a mass estimation signal m_est that is
output to the multiplier 14. The mass estimation unit 18 endeavors to estimate the 
actual mass of wafer stage 12 as different substrate wafers W may vary in weight. 
Also, the actual force required to move wafer stage 12 depends on characteristics of the 
amplifiers, actuators, etc., which may vary due to component aging. 

[00086] Below, it will be assumed that the mass estimation unit 18
receives an input signal as to position x directly from the wafer stage 12. However,
alternatively, mass estimation unit 18 may receive input signals from the output of
other units, e.g., comparator 16 or PID unit 2, to calculate the mass estimation m_est.

[00087] Now, the calculations made by the mass estimation unit 18 will be
explained. The main idea is to estimate the mass from the relation:

[00088] F = m a 

[00089] The force F is generated by the controller, and is hence already available as 
output signal from second adder unit 10. The actual acceleration a, however, must be 
derived from the actual position signal x received from the wafer stage 12, e.g., using 
a digital double differentiator 22 (Figure 4): 

[00090] $a = \left(\frac{1 - z^{-1}}{T_s}\right)^2 x$, where $T_s$ is the sample time and $z^{-1}$ denotes a delay of one sample. 

[00091] This double differentiator 22, however, introduces a delay of one sample, which 
causes the force F to be one sample ahead of the acceleration a. For this 
reason, F must be delayed by one sample as well. In addition, the actuated force 
stays on the output of a digital-to-analog converter (not shown) for one sample, 
introducing another 0.5 sample delay. Further, the input/output delay (calculation time 
of the motion controller computer) influences the time shift between the force and the 
acceleration. For this reason, the force must be delayed by a total of 2.35 samples 
(determined from measurements on an actual system). To that end, a delay unit 20 is 
introduced, see Figure 4. 

[00092] The least-squares method is used to estimate the mass. In general, the 
least-squares method estimates parameters of a model of which the output is linear in its 
parameters. In this case, the model simply generates the estimated force $\hat{F}$, which 
equals the measured acceleration multiplied by the estimated mass $\hat{m}$: 

[00093] $\hat{F} = \hat{m} a$ 

[00094] The estimated force $\hat{F}$ is subtracted from the actual force F to generate an 
estimation error e: 

[00095] $e = F - \hat{F} = F - \hat{m} a$ 

[00096] This estimation error e is one of the variables used by the least-squares 
method to create its mass estimation $\hat{m}$ (see following paragraph). The resulting block 
diagram is shown in Figure 4, which shows the double differentiator 22 receiving the 
actual position signal from the wafer stage 12 and outputting acceleration a to a 
multiplier 24 that multiplies acceleration a by the estimated mass $\hat{m}$. 
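
The signal path of Figure 4 can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of the motion controller: the function names, the zero padding at start-up, and the use of linear interpolation for the fractional part of the 2.35-sample force delay are all hypothetical choices.

```python
def double_difference(x, ts):
    """Backward double difference: a(k) = (x(k) - 2 x(k-1) + x(k-2)) / ts^2.

    The first two outputs are padded with 0.0 because no history exists yet.
    """
    acc = [0.0, 0.0]
    for k in range(2, len(x)):
        acc.append((x[k] - 2.0 * x[k - 1] + x[k - 2]) / (ts * ts))
    return acc[:len(x)]


def fractional_delay(signal, delay):
    """Delay a sampled signal by a non-integer number of samples.

    Assumption: linear interpolation between the two neighbouring samples;
    samples before the start of the record are taken as 0.0.
    """
    whole = int(delay)
    frac = delay - whole
    out = []
    for k in range(len(signal)):
        newer = signal[k - whole] if k - whole >= 0 else 0.0
        older = signal[k - whole - 1] if k - whole - 1 >= 0 else 0.0
        out.append((1.0 - frac) * newer + frac * older)
    return out


def estimation_error(force, position, m_hat, ts, force_delay=2.35):
    """e(k) = F_delayed(k) - m_hat * a(k) for a fixed mass estimate m_hat."""
    acc = double_difference(position, ts)
    f_del = fractional_delay(force, force_delay)
    return [f - m_hat * a for f, a in zip(f_del, acc)]
```

The delay value is passed as a parameter because the text determines it from measurements on an actual system.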

[00097] In general, the least-squares method is a method to estimate parameters from 
input/output data. Here, only the recursive least-squares method is described because at 
each sample, a new estimation must be generated. This is in contrast with the situation 
where all data is available beforehand, and an estimation has to be made only once. 

[00098] A model output is generated that is the multiplication of a 'signal vector' $\omega$ 
and the previously estimated parameter vector $\hat{\theta}$: 

[00099] $\hat{y}(k) = \hat{\theta}^T(k-1)\,\omega(k)$. 

[000100] The difference between the real output $y(k)$ and the estimated output $\hat{y}(k)$ 
serves as estimation error: $e(k) = y(k) - \hat{y}(k)$. This estimation error $e(k)$ is used to update 
the estimated parameter vector: 

[000101] $\hat{\theta}(k) = \hat{\theta}(k-1) + \Gamma(k)\,\omega(k)\,e(k)$ 

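[000102] Here, the 'adaptation gain matrix' $\Gamma$ is updated every sample: 

[000103] $\Gamma(k) = \frac{1}{\lambda}\left[\Gamma(k-1) - \frac{\Gamma(k-1)\,\omega(k)\,\omega^T(k)\,\Gamma(k-1)}{\lambda + \omega^T(k)\,\Gamma(k-1)\,\omega(k)}\right]$ 

[000104] where: $\lambda$ = "forgetting factor" (see further below) 

[000105] In the case of the mass estimation discussed above, all of the above equations 
reduce to scalar equations: 

[000106] 
$y(k) = F(k)$ 
$\hat{y}(k) = \hat{m}(k-1)\,a(k)$ 
$\omega(k) = a(k)$ 
$\hat{\theta}(k) = \hat{m}(k)$ 
$e(k) = y(k) - \hat{y}(k) = F(k) - \hat{m}(k-1)\,a(k)$ 
$\Gamma(k) = \frac{1}{\lambda}\left[\Gamma(k-1) - \frac{\Gamma(k-1)\,\omega(k)\,\omega^T(k)\,\Gamma(k-1)}{\lambda + \omega^T(k)\,\Gamma(k-1)\,\omega(k)}\right] = \frac{1}{\lambda}\left[\Gamma(k-1) - \frac{\Gamma^2(k-1)\,a^2(k)}{\lambda + \Gamma(k-1)\,a^2(k)}\right] = \frac{\Gamma(k-1)}{\lambda + a^2(k)\,\Gamma(k-1)}$ 
$\hat{m}(k) = \hat{m}(k-1) + \Gamma(k)\,a(k)\left[F(k) - \hat{m}(k-1)\,a(k)\right]$ 
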
[000107] Hence, in each sample, the following steps are taken: 

[000108] Determine the current value of the signal vector $\omega(k)$. In the case of mass 
estimation, this equals the current value of the acceleration $a(k)$. 

[000109] Determine the model output, based on the previous estimation of the 
parameter vector $\hat{\theta}(k-1)$ and the current signal vector $\omega(k)$. In the case of the mass 
estimation, this equals the product of the previous mass estimate and the current value 
of the measured acceleration. 

[000110] From the model output, and the actual output (in the case of the mass 
estimation: the current value of the force), the estimation error $e(k)$ can be calculated. 

[000111] The adaptation gain matrix $\Gamma(k)$ is calculated using the above recursion 
equation. In the case of the mass estimation, where only one parameter is estimated, 
this is a scalar equation. 

[000112] The parameter estimate is updated by using $\Gamma(k)$, $\omega(k)$ and $e(k)$. This 
produces a new mass estimate. 

[000113] The parameter $\lambda$ denotes the 'forgetting factor'. If it is 1, the recursive 
least-squares method produces exactly the same output as the non-recursive version. This 
means that no parameter updates will be produced any more after a long period of time. 
To keep the method adapting the estimates, a value slightly below 1 should be chosen. 
In practice, a value of 0.995 appears to work well. A larger value slows down the 
estimation; a smaller value introduces noise in the estimation. 
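
The per-sample steps listed above can be sketched for the scalar mass case as follows. This is a minimal sketch, not the controller code: the function names, the initial estimate `m0` and the initial gain `gamma0` are assumptions chosen for illustration; only the forgetting factor 0.995 comes from the text.

```python
def rls_mass_step(m_prev, gamma_prev, a, f, lam=0.995):
    """One recursive least-squares update of the scalar mass estimate.

    m_prev     -- previous mass estimate m(k-1)
    gamma_prev -- previous adaptation gain Gamma(k-1)
    a, f       -- current acceleration a(k) and (delayed) force F(k)
    lam        -- forgetting factor
    """
    e = f - m_prev * a                                # estimation error e(k)
    gamma = gamma_prev / (lam + a * a * gamma_prev)   # scalar gain recursion
    m = m_prev + gamma * a * e                        # parameter update
    return m, gamma


def estimate_mass(accels, forces, m0=0.0, gamma0=1000.0, lam=0.995):
    """Run the recursion over recorded samples and return every estimate."""
    m, gamma = m0, gamma0
    history = []
    for a, f in zip(accels, forces):
        m, gamma = rls_mass_step(m, gamma, a, f, lam)
        history.append(m)
    return history
```

With noise-free data generated from a fixed mass, the estimate approaches that mass within a few samples; with real data the forgetting factor trades tracking speed against noise, as described above.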

[000114] Note that in the case of estimating other parameters than the mass, only the 
definition of the signal vector and the parameter vector need to be changed. 

[000115] Note also that when more parameters are estimated, the numerical 
complexity increases quadratically because matrix calculations are involved. 
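
For more than one parameter, the same recursion can be written with explicit matrix operations. The sketch below uses plain Python lists and assumes that the gain matrix stays symmetric, so that $\omega^T\Gamma$ can be taken as the transpose of $\Gamma\omega$; the function names, the initial gain, and the sinusoidal test excitation with a mass-plus-offset model are hypothetical choices for illustration.

```python
import math


def rls_step(theta, gamma, omega, y, lam=0.995):
    """One vector recursive least-squares update.

    theta -- previous parameter estimates (list of n floats)
    gamma -- previous adaptation gain matrix (n x n list of lists, symmetric)
    omega -- current signal vector (list of n floats)
    y     -- current measured output
    """
    n = len(theta)
    e = y - sum(t * w for t, w in zip(theta, omega))        # estimation error
    g = [sum(gamma[i][j] * omega[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(w * gi for w, gi in zip(omega, g))
    new_gamma = [[(gamma[i][j] - g[i] * g[j] / denom) / lam for j in range(n)]
                 for i in range(n)]                          # n*n work per sample
    gain = [sum(new_gamma[i][j] * omega[j] for j in range(n)) for i in range(n)]
    new_theta = [t + gi * e for t, gi in zip(theta, gain)]
    return new_theta, new_gamma


def demo_mass_and_offset(steps=500):
    """Estimate [mass, offset] from F = m*a + d with a well-excited acceleration."""
    theta = [0.0, 0.0]
    gamma = [[1000.0, 0.0], [0.0, 1000.0]]
    for k in range(steps):
        a = 10.0 * math.sin(0.1 * k)
        theta, gamma = rls_step(theta, gamma, [a, 1.0], 22.667 * a + 0.4)
    return theta
```

The gain matrix update touches every one of its n x n entries each sample, which is the quadratic growth in computation mentioned above.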

[000116] A complication is the presence of offset on the control force. This is 
introduced by DACs, amplifiers, or even a tilted stone introducing a gravity 
component. While the acceleration is zero, still some force value is 'actuated'. If no 
other forces are present, the estimator interprets this effect as an infinite mass because 
the control force results in zero acceleration. 

[000117] One possibility to tackle this is to estimate the offset as a second parameter. 
For this, the controller can be provided with an offset estimator which estimates the 
offset such that the estimated offset can for example subsequently be subtracted from 
the total mass estimation. However, this strategy may not work properly: because of the 
excitation format (the acceleration profile has relatively long periods of constant 
acceleration), the least-squares method cannot properly distinguish between offset and 
mass. In other words, the offset estimation actually results in disturbance of the mass 
estimation. 

[000118] For this reason, a simpler solution is chosen: both the actuated force and the 
acceleration are filtered by a high-pass filter. For a start, the corner frequency is set at 1 
Hz. Figure 5 shows the results of the mass estimation in the original situation, and the 
effects of either having an additional offset estimation or high-pass filters. 

[000119] It can be seen that if the offset is not taken into account, the estimated mass 
at a position around y=+150 mm is approximately 22.62 kg, while the estimated mass 
at y=-150 mm is about 22.47 kg. This very large difference of 150 g is no longer visible 
if either the offset is also estimated, or the high-pass filters are used. However, 
the high-pass filter solution is less disturbed in the 'jerk' phase of the set point, and is 
therefore preferable. 
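
The effect of filtering both signals can be illustrated with a small simulation. Everything in this sketch is an assumption made for illustration: the first-order filter structure, its initialisation, the square-wave acceleration profile, the 1 N offset, and the 0.2 ms sample time (inferred from the statement elsewhere in the text that 100 points span 20 msec).

```python
import math


def high_pass(signal, corner_hz, ts):
    """First-order discrete high-pass filter.

    The state is initialised so that a constant input yields zero output.
    """
    tau = 1.0 / (2.0 * math.pi * corner_hz)
    alpha = tau / (tau + ts)
    out = [0.0]
    for k in range(1, len(signal)):
        out.append(alpha * (out[-1] + signal[k] - signal[k - 1]))
    return out[:len(signal)]


def rls_mass(accels, forces, lam=0.995, m0=0.0, gamma0=1000.0):
    """Scalar recursive least-squares mass estimate; returns the final value."""
    m, gamma = m0, gamma0
    for a, f in zip(accels, forces):
        gamma = gamma / (lam + a * a * gamma)
        m = m + gamma * a * (f - m * a)
    return m


def demo(true_mass=22.667, offset=1.0, ts=2e-4):
    """Compare the estimate with and without high-pass filtering."""
    accels = [10.0 if (k // 100) % 2 == 0 else -10.0 for k in range(400)]
    forces = [true_mass * a + offset for a in accels]   # constant force offset
    raw = rls_mass(accels, forces)
    filtered = rls_mass(high_pass(accels, 1.0, ts), high_pass(forces, 1.0, ts))
    return raw, filtered
```

Because the same linear filter is applied to force and acceleration, the mass relation between them is preserved while the constant offset is removed; the unfiltered estimate is pulled away from the true mass by the offset.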

[000120] Figure 6 illustrates the mass and offset estimation as discussed above. It can 
be seen that, for example during the acceleration phase from 2.1 to 2.2 sec, both the 
mass and the offset are adjusted. This is caused by the fact that in the 
constant-acceleration region, the least-squares estimator cannot distinguish between an offset 
and a gain (both a higher offset and a higher mass could be the cause for a required 
higher force). This is a typical example of insufficient 'persistent excitation', which more 
or less means that there is not enough frequency content in the signals to create a 
correct parameter update. 

[000121] A similar problem exists during the constant-velocity phase, wherein the 
nominal controller force is zero, and so is the stage acceleration. In such a region, noise 
is the main contributor to the signal contents. The least-squares method reacts to such a 
condition by increasing its adaptation gain $\Gamma$. To prevent $\Gamma$ from growing out of bounds, 
adaptation is switched off when the set point acceleration becomes zero. An 
alternative would be to limit the trace of $\Gamma$. 
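
The growth of the adaptation gain follows directly from the scalar recursion: with zero acceleration it reduces to $\Gamma(k) = \Gamma(k-1)/\lambda$, so the gain grows geometrically. The sketch below illustrates this and the gating on the set point acceleration; the function names and the numerical values are hypothetical.

```python
def rls_step(m_prev, gamma_prev, a, f, lam=0.995):
    """Ungated scalar recursive least-squares update."""
    gamma = gamma_prev / (lam + a * a * gamma_prev)
    return m_prev + gamma * a * (f - m_prev * a), gamma


def gated_rls_step(m_prev, gamma_prev, a, f, a_setpoint, lam=0.995):
    """Freeze estimate and gain while the set point acceleration is zero."""
    if a_setpoint == 0.0:
        return m_prev, gamma_prev
    return rls_step(m_prev, gamma_prev, a, f, lam)


def gain_after_standstill(steps, gated, gamma0=1e-4, lam=0.995):
    """Adaptation gain after `steps` samples of exactly zero acceleration."""
    m, gamma = 22.667, gamma0
    for _ in range(steps):
        if gated:
            m, gamma = gated_rls_step(m, gamma, 0.0, 0.0, 0.0, lam)
        else:
            m, gamma = rls_step(m, gamma, 0.0, 0.0, lam)
    return gamma
```

Gating on the set point rather than on the measured acceleration keeps measurement noise from switching the estimator on and off.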

[000122] Note that the requirement on persistent excitation becomes more severe 
when the number of parameters increases (which, again, is clearly illustrated with the 
offset estimation example). 

[000123] The previous paragraphs already showed some mass-estimation results. 
However, the estimated mass was not yet made active in the feed-forward path. The 
following paragraphs describe some results when the mass estimation is made active in the 
feed-forward. 

[000124] Figure 7 shows an example, obtained by repeatedly moving the wafer stage 
in Y direction from -150 to +150 mm and back. The plots start with two negative 
acceleration phases: one to decelerate from +0.9 m/s to zero velocity, one to accelerate 
to -0.9 m/s. The end of the plot shows the deceleration to zero again, followed by 
acceleration to positive velocity. This implies that the 'left' side of the plot lies around 
+150 mm, while the right side of the plot is around -150 mm. 

[000125] The top window of Figure 7 shows the servo error without mass-estimation 
feed-forward. It can be seen that at the left side of the plot the peak error is about 62 
nm, while at the right side the error is about 44 nm. Hence, the controller error is 
position-dependent, and is smaller at Y=-150 mm. 

[000126] It can be seen that at the end of the second negative acceleration phase (around 
t=1.53 sec), the servo error obtained with and without mass-estimation feed-forward 
(middle window and upper window of Figure 7, respectively) is the same. Strikingly, at 
this point the mass estimation roughly equals the nominal value (bottom window of 
Figure 7). At the right side of the plot, the mass estimation increases during the two 
positive acceleration phases. It can be seen that also the servo error increases to the 
original value of 65 nm. 

[000127] When further inspecting the plots of Figure 7, it will be observed that the 
servo error is smaller when the estimated mass is smaller, and hence a smaller 
acceleration feed-forward is present. At the right side of the plot, it is advantageous to 
use the nominal value, because the mass estimation produces a 20 g higher mass feed- 
forward. In areas where the mass estimation is smaller than nominal, the servo error is 
smaller when using this estimated mass. The main conclusion here is that the usage of 
the nominal mass in the feed-forward is not optimal: a slightly smaller value improves 
the servo error. 

Second Embodiment 

[000128] The drift-like behavior in mass estimation during the acceleration phase 
could be caused by the fact that other disturbances influence the mass estimation. 
One candidate is the absence of velocity feed-forward in the control loop. 
Other candidates are the absence of jerk (derivative of acceleration) and snap (derivative 
of jerk) feed-forward. Because no compensation exists for disturbances in velocity, jerk and snap, 
the estimator 'pushes' all these effects into the mass estimation. 

[000129] To check whether this is actually the case, combinations of estimations were 
tested using the input/output traces as discussed above: 

[000130] Mass estimation only 

[000131] Estimation of mass and velocity feed-forward 

[000132] Estimation of mass, velocity and jerk feed-forward 

[000133] Estimation of mass, velocity, jerk and snap feed-forward 

[000134] To be able to estimate more parameters, the signal vector and parameter 
vector need to be extended. For this purpose, jerk, snap and velocity must be created by 
successive differentiation of the position of the wafer stage 12. Because each digital 
differentiation introduces 0.5 samples delay, the various signals must be delayed such 
that at the end they all have the same delay. The delay in the force signal must match 
this total delay. Figure 8 shows the creation of the signal vector and the place of the 
respective parameters $\hat{d}$, $\hat{m}$, $\hat{e}$ and $\hat{g}$, where: 

[000135] $\hat{d}$ = a velocity coefficient; 
[000136] $\hat{m}$ = mass; 
[000137] $\hat{e}$ = a jerk coefficient; 
[000138] $\hat{g}$ = a snap coefficient. 



Docket No.; P-1520.010-EP 



[000139] Figure 8 shows a first differentiator 28, a second differentiator 30, a third 
differentiator 32, and a fourth differentiator 34 connected in series. An actual position 
signal is received from the wafer stage 12 and input to the first differentiator 28. Thus, 
an actual velocity signal, an actual acceleration signal, an actual jerk signal and an 
actual snap signal are present at the outputs of the first differentiator 28, 
the second differentiator 30, the third differentiator 32, and the fourth differentiator 34, 
respectively. The actual velocity signal is multiplied by estimated velocity coefficient 
$\hat{d}$ in multiplier 36 and then delayed by 1.5 time periods by a delay unit 44. The actual 
acceleration signal is multiplied by estimated mass $\hat{m}$ in multiplier 24 and then delayed 
by one time period by a delay unit 46. The actual jerk signal is multiplied by estimated 
jerk coefficient $\hat{e}$ in multiplier 40 and delayed by 0.5 time period by a delay unit 50. 
The actual snap signal is multiplied by estimated snap coefficient $\hat{g}$ in multiplier 42. 
The outputs of the delay units 44, 46, 50 and of the multiplier 42 are shown to be added 
by adder units 52, 54, 56 to render an estimated force signal to subtraction unit 26. 
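
The delay alignment of Figure 8 can be sketched as follows. The sketch only builds the time-aligned signal vector and leaves out the multiplication by the estimated coefficients; the function names, the zero padding at start-up, and the linear interpolation used for half-sample delays are assumptions, not details taken from the figure.

```python
def difference(x, ts):
    """Backward difference; introduces half a sample of delay."""
    out = [0.0]
    for k in range(1, len(x)):
        out.append((x[k] - x[k - 1]) / ts)
    return out[:len(x)]


def delay(signal, samples):
    """Delay by a possibly fractional number of samples (linear interpolation)."""
    whole = int(samples)
    frac = samples - whole
    out = []
    for k in range(len(signal)):
        newer = signal[k - whole] if k - whole >= 0 else 0.0
        older = signal[k - whole - 1] if k - whole - 1 >= 0 else 0.0
        out.append((1.0 - frac) * newer + frac * older)
    return out


def signal_vectors(position, ts):
    """Return (velocity, acceleration, jerk, snap) tuples with equal total delay.

    Four differentiations delay the snap by 2 samples, so velocity, acceleration
    and jerk receive 1.5, 1.0 and 0.5 samples of extra delay respectively.
    """
    vel = difference(position, ts)
    acc = difference(vel, ts)
    jerk = difference(acc, ts)
    snap = difference(jerk, ts)
    return list(zip(delay(vel, 1.5), delay(acc, 1.0), delay(jerk, 0.5), snap))
```

For a position that grows quadratically in time, every entry of the vector at sample k then refers to the same instant, two samples earlier.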

[000140] Note that the estimated parameters need not all be used in the feed-forward, 
but could have the only goal of making the mass estimation more stable. 



[000141] The fact that jerk and snap feed-forwards could be required stems from the 
fact that the process is not only represented by a mass but also has higher-order 
dynamics. A first possibility is that the process is described by a mass plus one 
resonance frequency, yielding the following equation for a movement x of the wafer 
stage 12 as a reaction to a force F: 

[000142] $\frac{x}{F} = \frac{1}{g s^4 + e s^3 + m s^2}$ 

[000143] Adding a friction term yields: 

[000144] $\frac{x}{F} = \frac{1}{g s^4 + e s^3 + m s^2 + d s}$ 

[000145] The correct feed-forward for such a process would look like: 

[000146] $F = (g s^4 + e s^3 + m s^2 + d s)\, x_{SPG}$ 

[000147] Where: $x_{SPG}$ = the set point position as generated by the set point position 
generator. 

[000148] In addition to the acceleration feed-forward ($m s^2$), the velocity ($d s$), jerk ($e s^3$) 
and snap ($g s^4$) feed-forwards are clearly recognised. 
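
In the time domain, each power of s in the feed-forward corresponds to one derivative of the set point position. A minimal sketch, assuming the set point generator supplies the four derivatives directly (the function name and argument order are hypothetical):

```python
def feed_forward_force(vel, acc, jerk, snap, d, m, e, g):
    """Feed-forward force for one sample of set point derivatives.

    vel, acc, jerk, snap -- first to fourth time derivative of the set point
    d, m, e, g           -- velocity, mass, jerk and snap coefficients
    """
    return g * snap + e * jerk + m * acc + d * vel
```

For example, during a constant-acceleration phase jerk and snap are zero, so only the mass and velocity terms contribute.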

[000149] Figure 9 shows the mass estimation result under the conditions mentioned 
above. When only the mass m is estimated, the typical rise in mass during the 
acceleration phase is observed. If the velocity coefficient d is adjusted as well, the mass 
estimation becomes somewhat more stable. The mass estimation now drifts downwards 
during the acceleration. When, additionally, the jerk coefficient e is estimated, this 
result does not change significantly. However, when also the snap coefficient g is 
estimated, the mass estimation becomes the most stable. 

[000150] Figure 10 shows the estimation of all four parameters. It can be seen that 
especially the snap adjusts stably to 2.7e-7 Ns^3/m, but the other parameters are 
'disturbed' during the jerk phase. Also, the jerk coefficient was expected to be zero 
because no time lag between the measured acceleration and force should be present any 
more (a jerk coefficient means that a constant force is required during the jerk phase, 
which is also the case when the acceleration feed-forward timing is not correct; hence 
the presence of jerk feed-forward indicates a timing problem). 

[000151] Now, a test was performed, as follows. In three different X positions (-150 
mm, 0, +150 mm), repeated Y-movements of +/- 150 mm were performed. During 
each acceleration/deceleration part, the estimated mass (when using the combined 
velocity/mass/jerk/snap estimation as described above) is recorded by using the average 
of the last 100 points (20 msec) at the end of each acceleration/deceleration part (hence, 
before the jerk phase starts!). This yields an 'estimated mass' at six points within the 
wafer stage field. The estimated mass is summarized in the following tables. Note that 
the nominal mass, as calibrated in the feed-forward calibration, equals 22.667 kg. 

Table 1: Estimated Mass [kg] When Velocity, Acceleration, Jerk & Snap Feed-Forwards 
Are Also Estimated 

Y \ X       -150 mm    0          +150 mm
-150 mm     22.662     22.654     22.657
+150 mm     22.682     22.675     22.675


Table 2: Estimated Mass [kg] When Only The Mass Is Estimated 

Y \ X       -150 mm    0          +150 mm
-150 mm     22.674     22.667     22.666
+150 mm     22.696     22.688     22.688


[000152] In all X positions, the controller error varies between 40 and 60 nm when 
no feed-forward adjustment is performed. The error is 60 nm in all positions when 
feed-forward mass estimation is switched on. In Table 2 it can be seen that at the 
positions where the servo error is the same, the estimated mass matches the nominal 
mass. In the locations where the servo error is smaller when no feed-forward 
adjustment takes place, the estimated mass is higher, and hence the original 
feed-forward is actually smaller than required. Evidently, a slightly (20 g) too small 
acceleration feed-forward by itself reduces the servo error. 

Third Embodiment 

[000153] Estimating many parameters simultaneously may not be the perfect solution 
for all situations for various reasons: 

[000154] Matrix calculations ($\Gamma$!) become complicated and use a lot of calculation 
time. 

[000155] The excitation must be persistent enough, which differs with the signal 
type: the mass (=acceleration feed-forward) should be estimated only when the 
acceleration is sufficiently large, therefore the estimation is switched off when the 
acceleration set point is smaller than some value. However, the jerk feed-forward 
estimation requires a sufficient jerk, the snap estimation requires a sufficient snap, and 
the velocity feed-forward estimation requires a sufficient velocity. Hence, the 
estimation of the different parameters should be switched on during different phases of 
the trajectory, which is impossible when using the least-squares method. 

[000156] Not all parameters may be time-varying; focus should be placed on those 
parameters that change. 

[000157] On the other hand, as observed in the previous paragraphs, the mass 
estimation appears to be disturbed by the fact that the process not only consists of a 
mass but also includes higher-order dynamics (hence the estimation of the other 3 
parameters). Assuming the snap feed-forward takes away the higher-order dynamics 
from the mechanics, the mass estimator should now be connected to the control force 
minus the snap component, as shown in Figure 11 where the adder 10 receives a further 
input signal from a multiplier 58. The multiplier 58 multiplies a received snap set point 
signal by a feed-forward snap coefficient gff. The mass estimation block 18 now 
receives a force signal from the output of Notch 2 block 8, i.e. excluding the snap 
feed-forward component from multiplier 58. Because the snap feed-forward compensates the 
higher dynamics of the wafer stage, the relation between stage acceleration and the 
output of Notch 2 better resembles a mass. 

[000158] The following plots show preliminary results. Figure 12 shows the 
controller error without and with snap feed-forward, while the mass estimation is 
switched on. This mass estimation is shown in Figure 13. 

[000159] It can be seen that the mass estimation is not influenced by the presence of 
snap feed-forward. The snap feed-forward in this particular test by itself reduces the 
servo error by about a factor of 2 (from 60 to 30 nm peak). 

[000160] Inspecting the controller output, it was observed that the required output in 
standstill differed as much as 0.4 N in the extreme Y-positions. Note that the high-pass 
filters that should remove the offset were at a frequency of 1 Hz, while a complete 
move only takes about 0.32 sec. Hence, the high-pass filters may be set at too low a 
frequency. To test this, an experiment was performed by using a 10 Hz corner 
frequency for the high-pass filters, resulting in the controller errors of Figure 14 and 
mass estimation of Figure 15 (without and with snap feed-forward). Note that also a 
velocity feed-forward of 0.22 Ns/m was used. The plots show that at the same location, 
the estimator estimates a 20 g different mass, dependent on the movement direction 
(even when the sign of the acceleration is the same). This phenomenon is caused by 
nonlinear behavior of the amplifiers. 

[000161] Although the mass estimation has now become considerably faster, it can 
be seen that during the jerk phase quite a large disturbance remains visible. This 
appears to be caused by a remaining difference in timing between the generated 
acceleration and force signals. By decreasing the force delay from 2.35 to 2.25 samples, 
the mass estimation becomes more stable, as indicated in Figure 16. 

[000162] With the knowledge gained in the previous paragraphs, a new test was 
performed using the following conditions: 

Snap feed-forward gain                      3.4e-7 Ns J /ms (injection after notch2)
Snap filter frequency & damping             700 Hz, d=0.7
Snap delay correction                       400e-6 sec
Mass estimation high-pass filters           10 Hz
Mass estimation delay in force path         [illegible] samples, force extracted before snap injection
Mass estimation forgetting factor           0.995
Velocity feed-forward                       0.22 Ns/m expose stage (WT), 0 Ns/m measure stage (MT)
Nominal mass feed-forward as calibrated     expose: 22.652 kg, measure: 22.601 kg
earlier


[000163] Combinations with and without snap feed-forward and mass estimation 
were performed, both on the expose stage WT and the measure stage MT. It appears 
that in this case at the measure stage MT the nominal feed-forward does not match very 
well. Each test was done 6 times: on 3 X-positions (-150 mm, 0, +150 mm), moves 
having both a positive and a negative direction were performed. For the centre position 
(X=0), the following four plots in Figures 17, 18, 19, and 20, respectively, show the 
results, including the estimated mass. In each plot, also the peak servo error after the 
acceleration phase is indicated. In each of the Figures 17, 18, 19, and 20, the top left 
plot shows the original situation. The middle left plot shows the effect of mass 
estimation, with below it the estimated mass. The top right plot shows the result with 
snap feed-forward only. The middle right plot shows the result with mass estimation 
and snap feed-forward, with below it the estimated mass. 

[000164] The peak controller errors for the expose stage WT are summarized in the 
following tables. These are ordered according to the position in the field where the 
measurement was done, and because both Y=+150 and Y=-150 are present in each test 
file, two values are present. 

Table 3: Peak Controller Error [nm], Original Condition 

Y \ X       -150 mm    0          +150 mm
+150 mm     43.4       40.5       35.1
+150 mm     42.6       43.3       40.7
-150 mm     35.8       25.8       30.8
-150 mm     29.8       31.8       36.0


Table 4: Peak Controller Error [nm], Mass Estimation 

Y \ X       -150 mm    0          +150 mm
+150 mm     20.4       18.0       19.9
+150 mm     22.4       22.0       25.1
-150 mm     27.1       21.1       20.8
-150 mm     23.0       21.2       22.4


Table 5: Peak Controller Error [nm], Snap Feed-Forward 

Y \ X       -150 mm    0          +150 mm
+150 mm     23.0       19.3       13.0
+150 mm     20.2       23.4       12.7
-150 mm     8.7        13.7       12.0
-150 mm     11.1       10.0       12.2

Table 6: Peak Controller Error [nm], Mass Estimation And Snap Feed-Forward 

Y \ X       -150 mm    0          +150 mm
+150 mm     14.1       15.8       14.8
+150 mm     10.5       13.6       16.2
-150 mm     16.3       14.2       11.4
-150 mm     11.1       11.2       12.8


[000165] Regarding these experiments, the following conclusions can be drawn: 

[000166] For the measure stage MT, about 60 g mismatch exists between the feed- 
forward mass and the estimated mass. The effect of the mass estimator is that the peak 
servo error decreases from more than 100 nm to about 35 nm. 

[000167] At the expose stage WT, the mass estimation by itself also improves the 
controller error considerably (peak error decreases from 43 to 27 nm). The original 
error of 43 nm is relatively high due to non-exact mass calibration. 

[000168] When snap feed-forward is used, the mass estimator yields a slightly less 
varying mass than without snap feed-forward. 

[000169] When snap feed-forward is used, the peak controller error in the wafer stage 
field becomes more constant. The peak error decreases from 23 nm to 16 nm. Note that 
the snap feed-forward gain and timing were not tuned, and the estimated snap indicates a 
smaller value than used in the machine. Some room for improvement appears to be 
present. 

[000170] During the jerk phase, the mass estimate varies rapidly. This can be 
improved by switching on the estimator only when the maximum acceleration is 
reached (up to now, it is active whenever the acceleration is nonzero). In that case, the 
estimator is very constant at the end of an acceleration phase, but has to settle more 
during the start of the acceleration phase. This is shown in Figure 21. The effect in the 
machine is tested in the following paragraph. 

[000171] The same test was performed as discussed above, with the only change that 
the mass estimation is only active when the set point acceleration has reached its 
maximum. Hence, no adjustment takes place any more during the jerk phase, for the 
reasons mentioned above. The results for the expose stage WT are listed 
in the tables below. Table 11 summarizes results for the four combinations of snap 
feed-forward and mass estimation. It can be seen that the combination of snap feed-forward 
and mass estimation has performed slightly better than in the first test. Apparently, a 
constant feed-forward mass during the end of the acceleration phase improves the 
maximum error. In the plots, note that when using mass estimation only, the servo error 
is always smaller when the estimated mass is smaller than the nominal value, as 
observed earlier. This is no longer true if snap feed-forward is also used. 

Table 7: Peak Controller Error [nm], No Mass Estimation Or Snap Feed-Forward 

Y \ X       -150 mm    0          +150 mm
+150 mm     28.0       23.9       17.8
+150 mm     30.6       18.1       19.7
-150 mm     13.8       19.3       17.7
-150 mm     16.1       20.3       13.3

Table 8: Peak Controller Error [nm], Mass Estimation 

Y \ X       -150 mm    0          +150 mm
+150 mm     24.1       20.4       23.4
+150 mm     25.9       25.4       24.0
-150 mm     27.9       25.6       23.0
-150 mm     25.7       24.3       24.7


Table 9: Peak Controller Error [nm], Snap Feed-Forward 

Y \ X       -150 mm    0          +150 mm
+150 mm     10.9       15.0       19.0
+150 mm     13.6       19.9       13.7
-150 mm     21.0       28.0       25.8
-150 mm     18.8       26.6       21.7


Table 10: Peak Controller Error [nm], Mass Estimation And Snap Feed-Forward 

Y \ X       -150 mm    0          +150 mm
+150 mm     11.4       14.1       11.8
+150 mm     9.9        13.9       9.7
-150 mm     13.1       10.4       10.8
-150 mm     10.9       10.9       11.1

Table 11: Peak Position Errors [nm], Summarized 

Y       What?                 X = -150 mm    X = 0    X = +150 mm
+150    Original              28.0           23.9     17.8
        Mass estimation       24.1           20.4     23.4
        Snap FF               10.9           15.0     19.0
        Mass est + snap FF    11.4           14.1     11.8
-150    Original              13.8           19.3     17.7
        Mass estimation       27.9           25.6     23.0
        Snap FF               21.0           28.0     25.8
        Mass est + snap FF    13.1           10.4     10.8


Fourth Embodiment 

[000172] An alternative implementation was developed for the case when only 1 
parameter is estimated, as is true in the case of mass estimation. 

[000173] In the case of mass estimation, the non-recursive least-squares method 
attempts to find the optimal estimated mass $\hat{m}$ in the equation: 

[000174] $\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix} \hat{m} = \begin{bmatrix} f_1 \\ f_2 \\ \vdots \\ f_n \end{bmatrix}$, i.e., $A\hat{m} = F$ 

[000175] Here, $a_i$ ($i = 1, 2, \ldots, n$) is an acceleration sample ($a_n$ is the most recent 
sample), while $f_i$ ($i = 1, 2, \ldots, n$) is a control force sample ($f_n$ being the most recent 
one). This can be written into: 

[000176] 
$A^T A\,\hat{m} = A^T F$ 
$\hat{m} \sum_{i=1}^{n} a_i^2 = \sum_{i=1}^{n} a_i f_i$ 
$\hat{m} = \frac{\sum_{i=1}^{n} a_i f_i}{\sum_{i=1}^{n} a_i^2}$ 

[000177] Hence, the least-squares estimate can be found by the filter implementation 
shown in Figure 22. 

[000178] This form does not support using a forgetting factor $\lambda$ yet. Implementing 
this factor can be done as follows: 

[000179] 
$\hat{m} \sum_{i=1}^{n} \lambda^{n-i} a_i^2 = \sum_{i=1}^{n} \lambda^{n-i} a_i f_i$ 
$\hat{m} = \frac{\sum_{i=1}^{n} \lambda^{n-i} a_i f_i}{\sum_{i=1}^{n} \lambda^{n-i} a_i^2}$ 

[000180] The matching filter implementation then looks like that in Figure 23. This 
alternative implementation has a simpler form than the original least-squares 
implementation, which involved two recursion equations. 
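
The two-filter form can be sketched as follows: one first-order filter accumulates the products of acceleration and force, the other the squared accelerations, and the estimate is their ratio. This is an illustrative sketch, not a transcription of Figure 23; the function names and the rule of holding the previous estimate while the denominator is zero are assumptions.

```python
def filtered_mass_estimates(accels, forces, lam=0.995, m0=0.0):
    """Mass estimate as the ratio of two exponentially weighted sums."""
    num = 0.0   # running sum of lam^(n-i) * a_i * f_i
    den = 0.0   # running sum of lam^(n-i) * a_i^2
    m_hat = m0
    out = []
    for a, f in zip(accels, forces):
        num = lam * num + a * f
        den = lam * den + a * a
        if den > 0.0:
            m_hat = num / den
        out.append(m_hat)
    return out


def weighted_batch_estimate(accels, forces, lam=0.995):
    """Direct evaluation of the weighted least-squares formula, for comparison."""
    n = len(accels)
    num = sum(lam ** (n - 1 - i) * a * f
              for i, (a, f) in enumerate(zip(accels, forces)))
    den = sum(lam ** (n - 1 - i) * a * a for i, a in enumerate(accels))
    return num / den
```

The recursive filters give the same value as the direct sum at every sample, while needing only two multiply-accumulate operations and one division per sample.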

[000181] In the case of offset estimation, the used model would look like: 

[000182] F = ma + d 

[000183] where: d = estimated offset. 

[000184] When the offset is estimated by itself (not simultaneous with the mass), 
effectively a signal vector is used which is a constant of 1. Using the structure of Figure 
23, $a_i$ is replaced by 1. The top filter is then fed with the control force, while the 
bottom filter is fed by an input of 1. It can be easily calculated that the bottom filter 
settles to a value of $\frac{1}{1 - \lambda}$. This fixed value can then be used instead of the output of 
the bottom filter, yielding the structure of Figure 24. 
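
The settled value follows from the geometric series $\sum_{i=1}^{n} \lambda^{n-i} = \frac{1 - \lambda^n}{1 - \lambda}$, which tends to $\frac{1}{1-\lambda}$ for large n. A minimal sketch of the resulting offset estimator, with a hypothetical function name:

```python
def offset_estimates(forces, lam=0.995):
    """Offset estimate from one filter and the fixed factor (1 - lam)."""
    acc = 0.0
    out = []
    for f in forces:
        acc = lam * acc + f               # filter fed with the control force
        out.append((1.0 - lam) * acc)     # divide by the settled value 1/(1 - lam)
    return out
```

Because the fixed factor replaces the settling second filter, the estimate starts near zero and approaches the offset as the first filter fills up.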

Fifth Embodiment 

[000185] In the embodiments discussed above, the one and only estimated parameter 
that was used in the feed-forward, was the mass. This mass actually serves as the 
simplest form of 'inverse process dynamics': it is the inverse of the transfer from force 
to acceleration. The additional snap feed-forward actually serves as a better 'inverse 
process dynamics' by including one (zero-damping) resonance. 

[000186] An alternative feed-forward would be a higher-order model of the inverse 
process. One way to do this is to estimate a model of the process, invert this model, 
and use this inverse model as a filter in the acceleration feed-forward path. This 
method, however, has some drawbacks. First, the process usually has a high order of 
gain roll-off for higher frequencies, which translates into a strongly rising frequency 
characteristic of the process inverse. Furthermore, non-minimum phase zeros in the 
process transfer function translate into unstable poles in the inverse. This especially 
poses a problem in the discrete domain, where non-minimum phase zeros are common. 

[000187] The solution proposed here is to extend the mass estimator to an 'inverse 
process dynamics' estimator. The estimator then estimates the transfer function from 
measured acceleration to the applied force, rather than estimating the transfer function 
from the applied force to the acceleration, and inverting this estimated transfer 
function. This way, a transfer function estimate of the inverse dynamics will result that 
is optimal in a least-squares sense. By choosing, for example, a FIR filter architecture, 
stability of the estimate is guaranteed. 
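
As a rough sketch of this idea, the example below fits an FIR filter directly from acceleration to force by batch least squares. The process model, the mass value, the number of taps and all variable names are assumptions made for illustration; the batch solver stands in for whatever recursive estimator is actually used.

```python
import numpy as np

rng = np.random.default_rng(0)
mass = 20.0   # hypothetical stage mass [kg]
p = 0.5       # hypothetical first-order lag in the force-to-acceleration path

# Simulate the process: a(k) = p * a(k-1) + (1 - p) / mass * F(k)
force = rng.standard_normal(400)
acc = np.zeros_like(force)
for k in range(len(force)):
    prev = acc[k - 1] if k > 0 else 0.0
    acc[k] = p * prev + (1.0 - p) / mass * force[k]

# Fit F(k) = b0*a(k) + b1*a(k-1) + b2*a(k-2): input is acceleration, output is force.
m = 2
regressors = np.array([acc[k - m:k + 1][::-1] for k in range(m, len(acc))])
theta, *_ = np.linalg.lstsq(regressors, force[m:], rcond=None)

dc_gain = theta.sum()   # FIR gain at zero frequency
print(theta, dc_gain)
```

For this particular process the exact inverse is itself an FIR filter with coefficients mass / (1 - p) and -mass * p / (1 - p), so the fit recovers it and the DC gain equals the mass. No model inversion is needed, and an FIR filter has no poles that could become unstable.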

[000188] Figure 33 shows the basic architecture for this alternative approach. The 
architecture is similar to the one shown in Figure 3, and like reference numbers refer to 
the same components. However, the mass estimation unit 18 of Figure 3 has been 
generalised into a feed-forward (FF) filter estimation unit 60. Moreover, the multiplier 
14 of Figure 3 has been changed into a transfer function unit 62, arranged to apply a 
transfer function $H_{ff}$ to the set point acceleration. 
[000189] The difference between the architectures of Figures 3 and 33 is that it is 
no longer a mass that is estimated, but the relation between acceleration and force. 
As may be evident to a person skilled in the art, if the wafer stage 12 (or any other 
mass to be controlled) behaves as a "rigid body", then the architecture of Figure 33 
reduces to the one of Figure 3, since the transfer function $H_{ff}$ is then the same as 
multiplying by the mass $m_{ff}$. The difference between the two architectures is 
important when there are dynamics in the wafer stage 12. 

[000190] The feed-forward filter estimation unit 60 determines an estimated transfer 
function $H_{est}$ from the measured acceleration to the applied force F. In the transfer 
function unit 62 this estimated transfer function $H_{est}$ is used, together with the set 
point acceleration, to provide an estimated input force. This estimated input force is 
subtracted from the real input force and the difference is used by the least-squares 
mechanism to produce a new estimated transfer function. 
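
The update loop can be sketched with a standard recursive least-squares step with a forgetting factor. This is an illustrative textbook form, not necessarily the exact recursion used in the estimation unit 60; the function name `rls_update`, the forgetting factor, the initial covariance and the rigid-body test data are all assumptions.

```python
import numpy as np

def rls_update(theta, P, omega, force, lam=0.995):
    """One recursive least-squares step with forgetting factor lam (assumed form)."""
    est_force = omega @ theta                      # estimated input force
    err = force - est_force                        # real force minus estimated force
    gain = P @ omega / (lam + omega @ P @ omega)   # estimator gain vector
    theta = theta + gain * err                     # new parameter estimate
    P = (P - np.outer(gain, omega @ P)) / lam      # covariance update
    return theta, P, err

rng = np.random.default_rng(1)
mass = 20.0                       # hypothetical rigid-body mass [kg]
acc = rng.standard_normal(300)    # measured acceleration (synthetic)
force = mass * acc                # applied force for a rigid body

theta = np.zeros(2)               # two FIR coefficients b0, b1
P = 1e6 * np.eye(2)               # large initial covariance: no prior knowledge
for k in range(1, len(acc)):
    omega = np.array([acc[k], acc[k - 1]])   # signal vector [u(k), u(k-1)]
    theta, P, err = rls_update(theta, P, omega, force[k])
print(theta)
```

For the rigid-body data used here the estimate converges to b0 close to the mass and b1 close to zero, which matches the observation that the filter then reduces to a multiplication by the mass.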

[000191] The most general structure of such an inverse process dynamics estimator 
that can be used is the ARX structure, as shown in Figure 25. 

[000192] In general terms, the transfer function of this structure is: 

[000193] $y(k) = -a_1 y(k-1) - a_2 y(k-2) - \cdots - a_n y(k-n) + b_0 u(k) + b_1 u(k-1) + \cdots + b_m u(k-m)$ 

[000194] Hence, the signal vector $\omega(k)$ and parameter vector $\theta(k)$ are defined by: 

[000195] $\omega(k) = [-y(k-1), -y(k-2), \ldots, -y(k-n), u(k), u(k-1), \ldots, u(k-m)]$ 
$\theta(k) = [a_1, a_2, \ldots, a_n, b_0, b_1, \ldots, b_m]^T$ 

[000196] Here, the input u is formed by the measured acceleration. The output y 
then represents the estimated input force, which is compared with the real input force to 
create the estimation error. 

[000197] Another architecture that can be used is a FIR filter. The advantage of the 
FIR filter is that it cannot become unstable. The architecture is shown in Figure 26. 
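
The ARX recursion can be written as the product of a signal vector and a parameter vector, as in the sketch below. The function names and the zero initial conditions are assumptions. An FIR filter is obtained as the special case with no a-coefficients.

```python
import numpy as np

def arx_output(a, b, u_hist, y_hist):
    """y(k) = omega(k) . theta, with omega = [-y(k-1..k-n), u(k..k-m)] and theta = [a, b]."""
    omega = np.concatenate((-np.asarray(y_hist, dtype=float), np.asarray(u_hist, dtype=float)))
    theta = np.concatenate((np.asarray(a, dtype=float), np.asarray(b, dtype=float)))
    return float(omega @ theta)

def simulate(a, b, u):
    """Run the recursion over the input sequence u, assuming zero initial conditions."""
    n, m = len(a), len(b) - 1
    y = np.zeros(len(u))
    for k in range(len(u)):
        y_hist = [y[k - i] if k - i >= 0 else 0.0 for i in range(1, n + 1)]
        u_hist = [u[k - j] if k - j >= 0 else 0.0 for j in range(m + 1)]
        y[k] = arx_output(a, b, u_hist, y_hist)
    return y

# First-order ARX example: y(k) = 0.5*y(k-1) + u(k), driven by a unit step.
y_arx = simulate([-0.5], [1.0], [1.0, 1.0, 1.0, 1.0])

# FIR special case (no a-coefficients): y(k) = 2*u(k) + u(k-1), driven by an impulse.
y_fir = simulate([], [2.0, 1.0], [1.0, 0.0, 0.0])
print(y_arx, y_fir)
```

In the estimator, `u` would be the measured acceleration and `y` the estimated input force.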
[000198] The FIR filter recursion equation is: 
[000199] $y(k) = b_0 u(k) + b_1 u(k-1) + \cdots + b_m u(k-m)$ 

[000200] Consequently, the signal vector and parameter vector are defined as: 

[000201] $\omega(k) = [u(k), u(k-1), \ldots, u(k-m)]$ 
$\theta(k) = [b_0, b_1, \ldots, b_m]^T$ 

[000202] Again, the input u(k) is formed by the measured acceleration and the filter 
output y(k) represents the estimated input force, used to create the estimation error by 
subtracting it from the actual input force. 

[000203] Figure 27 shows the estimated transfer function for a large number of 
estimated parameters for both the FIR and the ARX filter. The FIR filter has 20 FIR 
taps (21 parameters), whereas the ARX filter is of the 10th order (21 parameters). The 
resemblance between the FIR and ARX transfer functions is striking. Figure 28 shows 
the resulting feed-forward force for both filters. The "overshoot" shows a striking 
resemblance to snap feed-forward. Figure 29 shows the estimated mass, which 
equals the DC gain of each of the resulting filters. Figure 30 and Figure 31 show the 
results for a much lower filter order. Figure 30 shows estimated transfer functions for a 
FIR filter with 4 taps (5 parameters) and a 2nd order ARX filter (5 parameters), 
respectively. Figure 31 shows the feed-forward force in both situations. The resulting 
feed-forward is not very different from the one associated with Figures 27-29, but the 
resulting estimated mass, shown in Figure 32, has now become much less stable in the 
ARX case. 
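
The relation between the estimated filter and the estimated mass can be illustrated by evaluating the filter at zero frequency (z = 1). The helper name `dc_gain` and the coefficient values below are hypothetical and serve only to show the calculation.

```python
def dc_gain(b, a=()):
    """DC gain of y(k) = -sum(a_i * y(k-i)) + sum(b_j * u(k-j)), evaluated at z = 1."""
    return sum(b) / (1.0 + sum(a))

# Hypothetical FIR estimate: gain is the sum of the taps.
mass_from_fir = dc_gain([40.0, -20.0])

# Hypothetical first-order ARX estimate: the a-coefficients enter through the denominator.
mass_from_arx = dc_gain([10.0], a=[-0.5])
print(mass_from_fir, mass_from_arx)
```

Because the ARX gain involves a division by 1 plus the sum of the a-coefficients, small variations in those coefficients can move the DC gain considerably when the denominator is small. This is one plausible reason why a low-order ARX estimate could show a less steady mass value than an FIR estimate.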